Threshold Learning for Optimal Decision Making

Author

  • Nathan F. Lepora
Abstract

Decision making under uncertainty is commonly modelled as a process of competitive stochastic evidence accumulation to threshold (the drift-diffusion model). However, it is unknown how animals learn these decision thresholds. We examine threshold learning by constructing a reward function that averages over many trials to Wald’s cost function that defines decision optimality. These rewards are highly stochastic and hence challenging to optimize, which we address in two ways: first, a simple two-factor reward-modulated learning rule derived from Williams’ REINFORCE method for neural networks; and second, Bayesian optimization of the reward function with a Gaussian process. Bayesian optimization converges in fewer trials than REINFORCE but is slower computationally with greater variance. The REINFORCE method is also a better model of acquisition behaviour in animals and a similar learning rule has been proposed for modelling basal ganglia function.
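
As a rough illustration of why these rewards are stochastic, the sketch below simulates single drift-diffusion trials and scores each with a Wald-style cost (an error penalty plus a time penalty), then averages over many trials. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the symmetric thresholds, and every parameter value (drift, noise, dt, error_cost, time_cost) are hypothetical choices made for illustration.

```python
import random

def ddm_trial(threshold, drift=0.5, noise=1.0, dt=0.01, rng=random):
    """Simulate one drift-diffusion trial with bounds at +/- threshold.

    Positive drift means the upper bound is the correct choice.
    Returns (correct, decision_time). All defaults are illustrative.
    """
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        # Euler-Maruyama step: deterministic drift plus Gaussian noise.
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return x > 0, t

def trial_reward(correct, decision_time, error_cost=1.0, time_cost=0.1):
    """Negative single-trial cost: error penalty plus time penalty (assumed form)."""
    return -(0.0 if correct else error_cost) - time_cost * decision_time

def mean_reward(threshold, n_trials=2000, seed=0):
    """Average the noisy single-trial rewards at a fixed threshold."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        correct, t = ddm_trial(threshold, rng=rng)
        total += trial_reward(correct, t)
    return total / n_trials
```

Comparing `mean_reward` across a few thresholds (for example 0.5, 1.0 and 2.0) shows the speed-accuracy trade-off: a low threshold gives fast but error-prone decisions, a high one gives accurate but slow decisions. Any single trial's reward is a poor guide to that average, which is what makes learning the threshold from trial-by-trial feedback hard.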

Similar resources

Identifying and Ranking Comprehensive Decision-Making Criteria for Determination of Optimal Gas Monetization Strategy Using Fuzzy Delphi Method

The development and exploitation of the country's gas fields requires huge investments that will require appropriate monetization strategies to offset these costs and generate revenue for the government. This study describes the most important gas monetization strategies, including pipeline, LNG, GTL, petrochemical, CNG, GTW and GTS, and emphasizes the need to utilize the appropriate s...

An Iterative Decision Rule to minimize cost of Acceptance Sampling Plan in Machine Replacement Problem

In this paper, we present an optimal iterative decision rule for minimizing the total cost of designing a sampling plan for the machine replacement problem, using dynamic programming and Bayesian inference. The model considers the cost of replacing the machine and the cost of defectives produced by the machine. The concept of a control threshold policy is applied for decision making. If ...

Bayesian Sequential Detection With Phase-Distributed Change Time and Nonlinear Penalty—A POMDP Lattice Programming Approach

We show that the optimal decision policy for several types of Bayesian sequential detection problems has a threshold switching curve structure on the space of posterior distributions. This is established by using lattice programming and stochastic orders in a partially observed Markov decision process (POMDP) framework. A stochastic gradient algorithm is presented to estimate the optimal linear...

Bayesian Sequential Detection with Phase-Distributed Change Time and Nonlinear Penalty -- A POMDP Approach

We show that the optimal decision policy for several types of Bayesian sequential detection problems has a threshold switching curve structure on the space of posterior distributions. This is established by using lattice programming and stochastic orders in a partially observed Markov decision process (POMDP) framework. A stochastic gradient algorithm is presented to estimate the optimal linear...

Optimization of Brain Tumor MR Image Classification Accuracy Using Optimal Threshold, PCA and Training ANFIS with Different Repetitions

Background: One of the leading causes of death is brain tumors. Accurate tumor classification leads to appropriate decision making and providing the most efficient treatment to the patients. This study aims to optimize brain tumor MR images classification accuracy using optimal threshold, PCA and training Adaptive Neuro Fuzzy Inference System (ANFIS) with different repetitions. Material and Meth...

A new machine replacement policy based on number of defective items and Markov chains

  A novel optimal single machine replacement policy using a single as well as a two-stage decision making process is proposed based on the quality of items produced. In a stage of this policy, if the number of defective items in a sample of produced items is more than an upper threshold, the machine is replaced. However, the machine is not replaced if the number of defective items is less than ...

Publication date: 2016